
 predictive model


Scalable Decision-Focused Learning through Cost-Sensitive Regression

arXiv.org Machine Learning

Many real-world combinatorial problems involve uncertain parameters, which can be predicted given contextual features and historical data. These 'predict-then-optimize' or 'contextual optimization' problems have gained significant attention: end-to-end training methods can now minimize the downstream task cost rather than the predictive error. However, despite their effectiveness, these decision-focused learning (DFL) approaches often rely on repeated solving of the underlying combinatorial optimization problem during training, making them computationally expensive and difficult to scale. We reframe the learning problem as a cost-sensitive multi-output regression problem: multi-output due to the combinatorial problem having multiple uncertain parameters, and cost-sensitive due to the downstream task cost being the real target. Our technical contribution is the formalization of multiple loss function components that follow from this reframing: cost-insensitive normalization, decision-aware asymmetric penalization of over- and underpredictions, and instance-based costs that mimic the true downstream task-based loss locally. These components require zero or one solve per training data instance, while requiring no further solves during training. Experiments show that the combination of loss components achieves comparable downstream task quality to the state of the art, while being significantly more efficient, enabling scaling to problem sizes that have not been tackled before with DFL.
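
To make the idea of asymmetric penalization of over- and underpredictions concrete, here is a minimal sketch in Python. It is not the loss from the paper: the function name `asymmetric_squared_error` and the weights `over_weight` and `under_weight` are assumptions chosen for illustration, and the abstract does not say how such weights would be derived from the downstream problem.

```python
import numpy as np

def asymmetric_squared_error(y_pred, y_true, over_weight, under_weight):
    """Squared error with separate weights for over- and underprediction.

    Hypothetical illustration only: the weights are supplied by the caller,
    whereas a decision-aware loss would derive them from the downstream task.
    """
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    # Positive error means the parameter was overpredicted.
    weights = np.where(err > 0, over_weight, under_weight)
    # Average over all outputs (multi-output regression).
    return float(np.mean(weights * err**2))

# Overpredicting by 1 costs three times as much as underpredicting by 1.
over = asymmetric_squared_error([2.0], [1.0], over_weight=3.0, under_weight=1.0)
under = asymmetric_squared_error([0.0], [1.0], over_weight=3.0, under_weight=1.0)
```

Evaluating such a loss needs no call to a combinatorial solver during training, which is the property the abstract emphasizes.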


Amortized Variational Inference for Joint Posterior and Predictive Distributions in Bayesian Uncertainty Quantification

arXiv.org Machine Learning

Bayesian predictive inference propagates parameter uncertainty to quantities of interest through the posterior-predictive distribution. In practice, this is typically performed using a two-stage procedure: first approximating the posterior distribution of model parameters, and then propagating posterior samples through the predictive model via Monte Carlo simulation. This sequential workflow can be computationally demanding, particularly for high-fidelity models such as those governed by partial differential equations. We propose a variational Bayesian framework that directly targets the posterior-predictive distribution and jointly learns variational approximations of both the posterior and the corresponding predictive distribution. The formulation introduces a variational upper bound on the Kullback-Leibler divergence together with moment-based regularization terms. The variational distributions are trained in an amortized manner, shifting computational effort to an offline stage and enabling efficient online inference. Numerical experiments ranging from analytical benchmarks to a finite-element solid mechanics problem demonstrate that the proposed method achieves more accurate predictive distributions than conventional two-stage variational inference, while substantially reducing the cost of online predictive inference.
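
The conventional two-stage procedure that the abstract contrasts against can be sketched on a toy problem. The example below assumes a conjugate Gaussian model for a scalar parameter and uses a hypothetical `forward_model` (a simple square) as a stand-in for an expensive simulator; none of these names or choices come from the paper, and the proposed amortized method itself is not shown.

```python
import numpy as np

def gaussian_posterior(data, noise_var, prior_mean=0.0, prior_var=1.0):
    """Stage 1: exact posterior of a Gaussian mean with known noise variance."""
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var + np.sum(data) / noise_var)
    return post_mean, post_var

def forward_model(theta):
    """Hypothetical predictive model; in practice this could be a PDE solve."""
    return theta**2

def posterior_predictive_samples(rng, post_mean, post_var, n_samples):
    """Stage 2: push posterior samples through the predictive model."""
    theta = rng.normal(post_mean, np.sqrt(post_var), size=n_samples)
    return forward_model(theta)

rng = np.random.default_rng(0)
data = rng.normal(1.5, 0.5, size=20)  # synthetic observations
post_mean, post_var = gaussian_posterior(data, noise_var=0.25)
q_samples = posterior_predictive_samples(rng, post_mean, post_var, 100_000)
```

Every predictive sample costs one evaluation of `forward_model`, which is why this stage becomes the bottleneck when the model is a high-fidelity simulation.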


4c4c937b67cc8d785cea1e42ccea185c-Supplemental.pdf

Neural Information Processing Systems

Proof of Proposition 1. Due to Jensen's inequality and the fact that, by assumption, the distribution of human predictions $P(h \mid x)$ is not a point mass, it holds that $\mathbb{E}_h[\ell(h(x), y) \mid x] > \ell(\mu_h(x), y)$.

Proof of Theorem 3. We first provide the proof of the unconstrained case. Note that the above problem is a linear program and it decouples with respect to $x$. Therefore, for each $x$, the optimal solution is given by

$$\pi_m(d = 1 \mid x) = \begin{cases} 1 & \text{if } \mathbb{E}_{y \mid x}\big[\ell(m(x), y) - \mathbb{E}_{h \mid x}[\ell(h, y)]\big] > 0, \\ 0 & \text{otherwise.} \end{cases}$$

Next, we provide the proof of the constrained case. To this end, we consider the dual formulation of the optimization problem, where we only introduce a Lagrange multiplier $\tau_{P,b}$ for the first constraint, i.e.,

$$\text{maximize} \quad \mathbb{E}_x\Big[\pi(x)\big(\mathbb{E}_{y,h \mid x}[\ell(h, y)] - \mathbb{E}_{y \mid x}[\ell(m(x), y)]\big)\Big] + \mathbb{E}_x\big[\tau_{P,b}\,(\pi(x) - b)\big] \qquad (13)$$

$$\text{subject to} \quad 0 \le \pi(x) \le 1 \quad \forall x \in \mathcal{X}. \qquad (14)$$

The inner minimization problem can be solved using an argument similar to the one for the unconstrained case.
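
The unconstrained threshold rule can be illustrated numerically. The sketch below assumes that $d = 1$ means deferring to the human (a reading of the notation, not something this excerpt states) and that the conditional expected losses are already available as arrays; the names `deferral_policy` and `expected_loss` are hypothetical.

```python
import numpy as np

def deferral_policy(model_loss, human_loss):
    """Return 1 where the model's expected loss exceeds the human's, else 0."""
    return (np.asarray(model_loss) - np.asarray(human_loss) > 0).astype(int)

def expected_loss(policy, model_loss, human_loss):
    """Per-instance expected loss when following a 0/1 deferral policy."""
    return policy * human_loss + (1 - policy) * model_loss

# Hypothetical conditional expected losses for three inputs x.
model_loss = np.array([0.9, 0.2, 0.5])
human_loss = np.array([0.4, 0.6, 0.5])

policy = deferral_policy(model_loss, human_loss)
losses = expected_loss(policy, model_loss, human_loss)
```

Because the problem decouples over $x$, the rule picks the smaller of the two expected losses independently at each input, so `losses` coincides with the elementwise minimum of `model_loss` and `human_loss`.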






Experimental Setup

Neural Information Processing Systems

We provide an extended version of the Experimental Setup from Section 5 below.

Linear Model. This domain involves learning a linear model when the underlying mapping between features and predictions is cubic. Concretely, the aim is to choose the top $B = 1$ out of $N = 50$ resources using a linear model. The fact that the features can be seen as 1-dimensional allows us to visualize the learned models (as seen in Figure 4).

Predict: Given a feature $x_n \sim U[0, 1]$, use a linear model to predict the utility $\hat{y}_n$ of choosing resource $n$, where the true utility is given by $y_n = 10x_n^3 - 6.5x_n$.
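
A minimal sketch of this domain in Python, assuming the true utility reads $y_n = 10x_n^3 - 6.5x_n$ and that the linear model is fit by ordinary least squares (a plain two-stage baseline, not a decision-focused loss). The helper names `sample_instance`, `fit_linear_mse` and `top_b` are hypothetical.

```python
import numpy as np

def true_utility(x):
    # Cubic ground truth from the setup, read as y = 10 x^3 - 6.5 x.
    return 10 * x**3 - 6.5 * x

def sample_instance(rng, n_resources=50):
    """Draw N features from U[0, 1] and compute their true utilities."""
    x = rng.uniform(0.0, 1.0, size=n_resources)
    return x, true_utility(x)

def fit_linear_mse(x, y):
    """Least-squares fit of y ~ w * x + b (assumed baseline)."""
    design = np.column_stack([x, np.ones_like(x)])
    (w, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    return w, b

def top_b(scores, budget=1):
    """Indices of the `budget` highest-scoring resources."""
    return np.argsort(scores)[-budget:]

rng = np.random.default_rng(0)
x_train, y_train = sample_instance(rng)
w, b = fit_linear_mse(x_train, y_train)

x_test, y_test = sample_instance(rng)
chosen = top_b(w * x_test + b)           # decision made from predictions
achieved = y_test[chosen].sum()          # true utility of that decision
best = y_test[top_b(y_test)].sum()       # utility of the optimal decision
```

The gap between `best` and `achieved` is the decision regret. With $B = 1$, only the ordering of the predictions matters, so a linear model's decision is determined entirely by the sign of its slope `w`.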



Retrospective for the Dynamic Sensorium Competition for predicting large-scale mouse primary visual cortex activity from videos

Neural Information Processing Systems

Understanding how biological visual systems process information is challenging because of the nonlinear relationship between visual input and neuronal responses. Artificial neural networks allow computational neuroscientists to create predictive models that connect biological and machine vision. Machine learning has benefited tremendously from benchmarks that compare different models on the same task under standardized conditions. However, there was no standardized benchmark to identify state-of-the-art dynamic models of the mouse visual system. To address this gap, we established the SENSORIUM 2023 Benchmark Competition with dynamic input, featuring a new large-scale dataset from the primary visual cortex of ten mice. This dataset includes responses from 78,853 neurons to 2 hours of dynamic stimuli per neuron, together with behavioral measurements such as running speed, pupil dilation, and eye movements. The competition ranked models in two tracks based on predictive performance for neuronal responses on a held-out test set: one focusing on predicting in-domain natural stimuli and another on out-of-distribution (OOD) stimuli to assess model generalization. As part of the NeurIPS 2023 Competition Track, we received more than 160 model submissions from 22 teams. Several new architectures for predictive models were proposed, and the winning teams improved the previous state-of-the-art model by 50%. Access to the dataset as well as the benchmarking infrastructure will remain online at www.sensorium-competition.net.